skip to main content


Search for: All records

Creators/Authors contains: "Ramchandran, Kannan"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Large-scale online recommendation systems must facilitate the allocation of a limited number of items among competing users while learning their preferences from user feedback. As a principled way of incorporating market constraints and user incentives in the design, we consider our objectives to be two-fold: maximal social welfare with minimal instability. To maximize social welfare, our proposed framework enhances the quality of recommendations by exploring allocations that optimistically maximize the rewards. To minimize instability, a measure of users' incentives to deviate from recommended allocations, the algorithm prices the items based on a scheme derived from the Walrasian equilibria. Though it is known that these equilibria yield stable prices for markets with known user preferences, our approach accounts for the inherent uncertainty in the preferences and further ensures that the users accept their recommendations under offered prices. To the best of our knowledge, our approach is the first to integrate techniques from combinatorial bandits, optimal resource allocation, and collaborative filtering to obtain an algorithm that achieves sub-linear social welfare regret as well as sub-linear instability. Empirical studies on synthetic and real-world data also demonstrate the efficacy of our strategy compared to approaches that do not fully incorporate all these aspects. 
    more » « less
  2. In this paper, we consider hybrid parallelism—a paradigm that em- ploys both Data Parallelism (DP) and Model Parallelism (MP)—to scale distributed training of large recommendation models. We propose a compression framework called Dynamic Communication Thresholding (DCT) for communication-efficient hybrid training. DCT filters the entities to be communicated across the network through a simple hard-thresholding function, allowing only the most relevant information to pass through. For communication efficient DP, DCT compresses the parameter gradients sent to the parameter server during model synchronization. The threshold is updated only once every few thousand iterations to reduce the computational overhead of compression. For communication efficient MP, DCT incorporates a novel technique to compress the activations and gradients sent across the network during the forward and backward propagation, respectively. This is done by identifying and updating only the most relevant neurons of the neural network for each training sample in the data. We evaluate DCT on publicly available natural language processing and recommender models and datasets, as well as recommendation systems used in production at Facebook. DCT reduces communication by at least 100× and 20× during DP and MP, respectively. The algorithm has been deployed in production, and it improves end-to-end training time for a state-of-the-art industrial recommender model by 37%, without any loss in performance. 
    more » « less
  3. null (Ed.)
  4. Abstract

    Despite recent advances in high-throughput combinatorial mutagenesis assays, the number of labeled sequences available to predict molecular functions has remained small for the vastness of the sequence space combined with the ruggedness of many fitness functions. While deep neural networks (DNNs) can capture high-order epistatic interactions among the mutational sites, they tend to overfit to the small number of labeled sequences available for training. Here, we developed Epistatic Net (EN), a method for spectral regularization of DNNs that exploits evidence that epistatic interactions in many fitness functions are sparse. We built a scalable extension of EN, usable for larger sequences, which enables spectral regularization using fast sparse recovery algorithms informed by coding theory. Results on several biological landscapes show that EN consistently improves the prediction accuracy of DNNs and enables them to outperform competing models which assume other priors. EN estimates the higher-order epistatic interactions of DNNs trained on massive sequence spaces-a computational problem that otherwise takes years to solve.

     
    more » « less
  5. null (Ed.)
  6. Ranzato, M. ; Beygelzimer, A. ; Dauphin, Y. ; Liang, P.S. ; Vaughan, J. Wortman (Ed.)